GLdebug is a great tool for a quick examination of a program, and the only
tool available if you do not have access to the source.  It is useful as a
sanity check of the application: is the programmer telling the truth about
their code?

o GLdebug can be used both to debug and to tune:
    - tells you what graphics calls are being issued 
    - look for lots of mode changes, or unnecessary mode settings
    - verify subpixel(1); glcompat(GLC_OLDPOLYGON, FALSE);
    - check on:
	shademodel(FLAT/GOURAUD)
	infinite vs. local lights, or LOCALVIEWER set in the Light Model
	two sided lighting
	mode changes: 
	    frequent calls to shademodel, zbuffer, blendfunction	
	    use of lmdef instead of lmcolor
    - check for duplicate data 
	(often seen in normals and colors with flat-shading)
    - or unnecessary vertex bindings,
        such as unneeded per-vertex colors or normals for a flat shaded object.
    * must be single process to run gldebug
    * using ignore files will simplify gldebug output
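
The mode-change check above can be roughed out mechanically once a history
file exists.  The following is a minimal sketch, not part of GLdebug: it
assumes each history line starts with the GL call name, as in the trace at
the end of these notes, and that the lines have already been read into an
array.  The helper names is_mode_call and count_mode_calls are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mode-setting calls named in the checklist above; extend as needed. */
static const char *const mode_calls[] = {
    "shademodel(", "zbuffer(", "blendfunction(", "lmbind("
};

/* Return 1 if the trace line starts with one of the mode calls, else 0. */
int is_mode_call(const char *line)
{
    size_t i;

    assert(line != NULL);
    for (i = 0; i < sizeof mode_calls / sizeof mode_calls[0]; i++) {
        if (strncmp(line, mode_calls[i], strlen(mode_calls[i])) == 0)
            return 1;
    }
    return 0;
}

/* Count the mode calls in an array of trace lines (hypothetical helper). */
size_t count_mode_calls(const char *const *lines, size_t n)
{
    size_t i, count = 0;

    assert(lines != NULL || n == 0);
    for (i = 0; i < n; i++)
        count += (size_t)is_mode_call(lines[i]);
    return count;
}
```

A high count relative to the number of primitives drawn in one frame is a
hint, not proof, that modes are being toggled more often than necessary.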
 

o What warning messages are printed?

Explanation of Options:
-----------------------
     -h              no history output.
[use this option when you only want to use the Stateviewer]

     -w              no warning output.

     -e              no error output.

     -f              no fatal error output.


     -c              do not run Controller.

     -s              do not run Stateviewer.
[run these options when you only want to see the output history, i.e. when
you are looking for known bad habits which degrade performance.  It may be
useful to generate a history file in one pass then run the Stateviewer while 
examining the output.] 

     -C              generate C code in history file.
[this is of limited use, as the generated code never looks like the
application.  It can, however, help reproduce a bug without copying an
unmanageably large amount of application code, and it is also useful for
producing a benchmark of the code in the application.  The generated code
will not compile as is.]

     -F              flush output buffer to history file after each GL call.

     -p wait         profile (output the number of times each GL function is
                     called).  wait is the number of GL calls to wait between
                     each profile write to file.  Profile output goes to
                     GLdebug.count.

     -i filename     ignore the GL functions listed in filename when writing
                     output.  filename should contain GL function names listed
                     one per line.
[very useful for suppressing commands which carry lots of data like texdef,
defpattern, v3f, etc.]

     -o filename     send history trace output to filename. Default is
                     GLdebug.history.

     -O              send history trace output to stdout. This overrides -o
                     filename.
[not very useful, history files are always big]


o Useful alias: gldebug -i ~/gldignore -sF
    gldignore:
    -----------
    qread
    getmatrix
    defpattern
    texdef

Taking a GLdebug Trace:
-----------------------
    gldebug session to grab one frame:
        - start up gldebug
        - turn off output and breakpoints
        - set breakpoint at swapbuffers
        - go to interesting frame
        - turn on breakpoints
        > will stop at swapbuffers
        - turn on output
        - continue
        > will stop at swapbuffers, outputting one full scene to GLdebug.history
        - quit and look at output
        
        * note:  grabbing one frame will show what was set during that frame
            but will not reflect modes that were set previously.
            Therefore, it is best to have a program that can come up in the
            desired location and with the desired modes, and then grab
            the first 2 frames: one for initialization, one for
            continued drawing.


EXAMPLE:
--------

> Vince,
> 
>     Here is a chunk of the output.  Note that a number of different 
> techniques are used for drawing the models within the scene.  So this
> is only representative of a subset of the drawing (e.g. I don't even
> know if any of the models in this section have textures turned on).
> 

While i was working with Frank, he said he thought their code was finely
tuned for VGX.  He said something about a team of programmers working on the
code for 10 man-years.  At first i had little confidence that we could improve
his code, but i think we have found room for improvement.

First, as you both know, if you move an app from the VGX to RE and see no
improvement, it probably means that you have a CPU bottleneck or that something
really stupid is being done in the graphics code.  Unfortunately, we do not
know what all of those "stupid" things are on RE yet.
Also, the demo that he is running had no texture mapping, and
i am not surprised to learn that there was little improvement for non-texture
mapped primitives.  For standard Phong-lighted, Z-buffered, non-textured
primitives the performance is about the same.  The flat-shaded tmesh
performance is exactly the same.  The Gouraud-shaded tmesh performance is
about 10% higher on an RE.
The biggest improvement comes with independent Gouraud-shaded quads, about 33%.
Turn on texturing and you get a big win.

i helped Frank generate one frame of gldebug output and asked him to send
me the file.  A quick glance at the data reveals 3 sets of superfluous calls
to the GL, only 2 of which could impact performance.  The improvements made
here should result in improved performance on both VGX and RE, since they will
reduce the CPU bottleneck.

	1) n3f
	2) lmbind
	3) misc.
	
1) The biggest problem is with duplicated normals. One trick to remember is that
the hardware caches the current normal and applies it to any
subsequent vertexes that are sent without normals while lighting is enabled.
If you look at the tmeshes you will notice that 50% to over 90% of the normal
data is duplicated.  Note that i suppressed the gldebug output of v3f commands.
If you look at the first FLAT shaded tmesh, which has 12 vertexes, you will
see that 12 identical normals are being sent.  Eleven of them are redundant,
so nearly half of the data sent for that mesh is unnecessary.  Since lighting
was enabled, the same normal was also transformed for each copy.  Furthermore,
if multiple objects share the same normal, it need only be sent once.  This
change may require rebuilding the database.

2) It appears that every new lmbind call is preceded by a call to
lmbind(MATERIAL, 0), which disables lighting.  This is only necessary if they
wish to draw an unlighted object; otherwise it is inefficient toggling of
modes.  The RealityEngine is very sensitive to mode changes.  Remove that
lmbind.

3) There appear to be calls whose results are never used, e.g. getmatrix()
and getpattern().  The query calls can be expensive because they copy data
back to the host or often have to go into feedback mode.
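
The duplicated-normal check from item 1 can be automated over a history
file.  This is a minimal sketch under stated assumptions: the history lines
have the form shown in the trace below (without the "> " quote prefix),
identical normals print as identical text, and the current normal persists
across primitives as described in item 1.  The function name
count_repeated_normals is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count n3f lines that repeat the most recent n3f line exactly.
 * The comparison is textual, so it relies on identical normals being
 * printed identically.  If total_out is not NULL it receives the total
 * number of n3f lines seen.
 */
size_t count_repeated_normals(const char *const *lines, size_t n,
                              size_t *total_out)
{
    const char *last = NULL;   /* most recent n3f line, if any */
    size_t i, total = 0, repeated = 0;

    assert(lines != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (strncmp(lines[i], "n3f(", 4) != 0)
            continue;          /* not a normal; current normal unchanged */
        total++;
        if (last != NULL && strcmp(last, lines[i]) == 0)
            repeated++;
        last = lines[i];
    }
    if (total_out != NULL)
        *total_out = total;
    return repeated;
}
```

Applied to the FLAT shaded tmesh in the trace below, this reports 12 n3f
lines, 11 of them repeats.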

Finally, it would be helpful to see some prof/pixie output from their program
to verify this.  If we are truly experiencing a CPU bottleneck then you should
see gl_i_v3f and gl_i_n3f listed at the very top of the pixie readings.

good luck and i hope this helps,
vince
 
> getpattern();
> getmatrix(OUT);
> lmbind(MATERIAL, 0);
> lmbind(MATERIAL, 5);
> shademodel(GOURAUD);
> bgntmesh();
> n3f({1.000000, 0.000000, 0.000000});
> n3f({1.000000, 0.000000, 0.000000});
> n3f({0.500000, 0.797443, -0.337763});
> n3f({0.500000, 0.797443, -0.337763});
> n3f({-0.500000, 0.797443, -0.337763});
> n3f({-0.500000, 0.797443, -0.337763});
> n3f({-1.000000, 0.000000, 0.000000});
> n3f({-1.000000, 0.000000, 0.000000});
> n3f({-0.500000, -0.797443, 0.337763});
> n3f({-0.500000, -0.797443, 0.337763});
> n3f({0.500000, -0.797443, 0.337763});
> n3f({0.500000, -0.797443, 0.337763});
> n3f({1.000000, 0.000000, 0.000000});
> n3f({1.000000, 0.000000, 0.000000});
> endtmesh();

> lmbind(MATERIAL, 0);
> lmbind(MATERIAL, 6);
> shademodel(FLAT);
> bgntmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> endtmesh();
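
The trace above shows each material bind preceded by lmbind(MATERIAL, 0).
A minimal sketch of how that pattern could be counted in a history file
follows.  It assumes the lines appear exactly as printed above, with the
"> " quote prefix removed, and the function name count_unbind_rebind_pairs
is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if line binds MATERIAL to index 0 (the unbind call). */
static int is_material_unbind(const char *line)
{
    return strcmp(line, "lmbind(MATERIAL, 0);") == 0;
}

/* Return 1 if line binds MATERIAL to some index other than 0. */
static int is_material_bind(const char *line)
{
    static const char prefix[] = "lmbind(MATERIAL, ";

    return strncmp(line, prefix, sizeof prefix - 1) == 0
        && !is_material_unbind(line);
}

/* Count unbinds that are immediately followed by a bind of a material. */
size_t count_unbind_rebind_pairs(const char *const *lines, size_t n)
{
    size_t i, pairs = 0;

    assert(lines != NULL || n == 0);
    for (i = 0; i + 1 < n; i++) {
        if (is_material_unbind(lines[i]) && is_material_bind(lines[i + 1]))
            pairs++;
    }
    return pairs;
}
```

Each pair counted is one lmbind(MATERIAL, 0) call that can probably be
removed, provided nothing unlighted is drawn between the two binds.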